Skip to content

buffer,string_decoder: consolidate encoding validation logic#7207

Closed
jasnell wants to merge 1 commit into
nodejs:masterfrom
jasnell:normalized-encoding
Closed

buffer,string_decoder: consolidate encoding validation logic#7207
jasnell wants to merge 1 commit into
nodejs:masterfrom
jasnell:normalized-encoding

Conversation

@jasnell

@jasnell jasnell commented Jun 7, 2016

Copy link
Copy Markdown
Member
Checklist
  • make -j4 test (UNIX) or vcbuild test nosign (Windows) passes
  • documentation is changed or added
  • the commit message follows commit guidelines
Affected core subsystem(s)

buffer, string_decoder

Description of change

Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

@nodejs/buffer @mscdex

@jasnell jasnell added buffer Issues and PRs related to the buffer subsystem. string_decoder Issues and PRs related to the string_decoder subsystem. labels Jun 7, 2016
@nodejs-github-bot nodejs-github-bot added the util Issues and PRs related to the built-in util module. label Jun 7, 2016
@mscdex

mscdex commented Jun 7, 2016

Copy link
Copy Markdown
Contributor

I don't think this will work for StringDecoder since some libraries override Buffer.isEncoding(), so we need to check the result of that function if we don't match a known internal encoding. Doing that as-is with these changes will mean executing the normalization function twice in the case that Buffer.isEncoding() isn't overridden.

@jasnell

jasnell commented Jun 7, 2016

Copy link
Copy Markdown
Member Author

Ok, let me take one more stab at it...

Comment thread lib/buffer.js Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just include it at the top of the file. This is already being required by other modules prior to this point so there's no savings.

@jasnell

jasnell commented Jun 7, 2016

Copy link
Copy Markdown
Member Author

@mscdex ... ok, how does that look now?

Comment thread lib/internal/util.js Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: this can be simplified to just return;

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I did it this way just to make it more explicit. Either way works for me tho

@mscdex

mscdex commented Jun 7, 2016

Copy link
Copy Markdown
Contributor

LGTM except for minor nit. Should probably run this through citgm though just to be extra safe.

@jasnell

jasnell commented Jun 8, 2016

Copy link
Copy Markdown
Member Author

CI: https://ci.nodejs.org/job/node-test-pull-request/2952/ ... Green except for unrelated build bot failure
CITGM: https://ci.nodejs.org/view/Node.js-citgm/job/citgm-smoker/300/ ... Green

Comment thread lib/internal/util.js Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to add the comment we may as well just use return undefined; ;-)

Comment thread lib/buffer.js Outdated

@trevnorris trevnorris Jun 8, 2016

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say either

  1. Prefix the Symbol to help prevent collisions (e.g. node.isEncoding)
  2. Create and export this symbol from internal/util.js so it's no longer global
    I lean towards the later.

@jasnell

jasnell commented Jun 8, 2016

Copy link
Copy Markdown
Member Author

@trevnorris @mscdex ... updated and added a couple of tests. It turns out that we were not previously testing or verifying that Buffer.isEncoding() with non-string values would return false.

@trevnorris

Copy link
Copy Markdown
Contributor

LGTM

@mscdex

mscdex commented Jun 11, 2016

Copy link
Copy Markdown
Contributor

LGTM if CI is still ok: https://ci.nodejs.org/job/node-test-pull-request/2988/

@jasnell

jasnell commented Jun 12, 2016

Copy link
Copy Markdown
Member Author

That CI run had a couple buildbot issues. Running again just to be safe
https://ci.nodejs.org/job/node-test-pull-request/2991/

Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.
@jasnell jasnell force-pushed the normalized-encoding branch from 873a028 to 7b459a9 Compare June 21, 2016 15:50
@jasnell

jasnell commented Jun 21, 2016

Copy link
Copy Markdown
Member Author

New CI before landing... just in case.. https://ci.nodejs.org/job/node-test-pull-request/3038/

jasnell added a commit that referenced this pull request Jun 21, 2016
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: #7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
@jasnell

jasnell commented Jun 21, 2016

Copy link
Copy Markdown
Member Author

Landed in 6dd093d

@jasnell jasnell closed this Jun 21, 2016
@thefourtheye

Copy link
Copy Markdown
Contributor

Returning different types of values from a function could have been avoided :(

@mscdex

mscdex commented Jun 21, 2016

Copy link
Copy Markdown
Contributor

@thefourtheye What do you mean?

@thefourtheye

Copy link
Copy Markdown
Contributor

@mscdex normalizeEncoding returns string and undefined based on the inputs, right? I was thinking, this could have been avoided.

@mscdex

mscdex commented Jun 21, 2016

Copy link
Copy Markdown
Contributor

@thefourtheye How could it have been avoided?

@thefourtheye

thefourtheye commented Jun 21, 2016

Copy link
Copy Markdown
Contributor

@mscdex maybe we could have returned an empty string instead of undefined?

@mscdex

mscdex commented Jun 21, 2016

Copy link
Copy Markdown
Contributor

@thefourtheye Returning undefined or an empty string both have their pros and cons so there's really no difference IMHO, and I believe any performance hit from not consistently returning a string is likely to be negligible in this case.

@Fishrock123

Copy link
Copy Markdown
Contributor

seems to have conflicts on v6

@jasnell

jasnell commented Jun 27, 2016

Copy link
Copy Markdown
Member Author

@Fishrock123 ... this PR builds on @mscdex's work in #6777. This can be backported to v6 if #6777 is. I don't believe a don't-land label is appropriate but we might need a better way of tracking dependencies between PRs.

@mscdex

mscdex commented Jun 27, 2016

Copy link
Copy Markdown
Contributor

@jasnell That PR is already in v6, it's actually in v6.2.1.

@MylesBorins

Copy link
Copy Markdown
Contributor

I put the don't land label for v4.x please correct if wrong

addaleax added a commit to addaleax/node that referenced this pull request Sep 7, 2016
Fishrock123 pushed a commit to Fishrock123/node that referenced this pull request Sep 8, 2016
steal a line from nodejs#7207 to make things work

PR-URL: nodejs#8437
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
Fishrock123 pushed a commit to Fishrock123/node that referenced this pull request Sep 8, 2016
steal a line from nodejs#7207 to make things work

PR-URL: nodejs#8437
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
addaleax pushed a commit to addaleax/node that referenced this pull request Sep 9, 2016
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: nodejs#7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>
Fishrock123 pushed a commit that referenced this pull request Sep 13, 2016
Buffer.isEncoding and string_decoder.normalizeEncoding shared
quite a bit of logic. This moves the primary logic into
internal/util. The userland modules that monkey patch Buffer.isEncoding
should still work.

PR-URL: #7207
Reviewed-By: Brian White <mscdex@mscdex.net>
Reviewed-By: Trevor Norris <trev.norris@gmail.com>

Refs: #8463
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Jeremiah Senkpiel <fishrock123@rocketmail.com>
BridgeAR added a commit to BridgeAR/node that referenced this pull request Mar 2, 2018
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
MayaLekova pushed a commit to MayaLekova/node that referenced this pull request May 8, 2018
Due to code consolidation in nodejs#7207
the isEncoding function got less strict. This commit makes sure
isEncoding returns false for empty strings as before the consolidation.

PR-URL: nodejs#18790
Refs: nodejs#7207
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

buffer Issues and PRs related to the buffer subsystem. string_decoder Issues and PRs related to the string_decoder subsystem. util Issues and PRs related to the built-in util module.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants